Discovering Knowledge from High-Dimensional Geographic Data: Integrating Visual and Computational Approaches
Abstract
It has been widely recognized that spatial data analysis capabilities have not kept up with the need to analyze the increasingly large volumes of geographic data, covering many themes, that are now being collected and archived (Openshaw 1991; Miller and Han 2001; Shekhar, Vatsavai et al. 2002; Guo 2003; Guo, Peuquet et al. 2003; Muntz, Barclay et al. 2003). On the one hand, such a wealth of data offers great opportunities for geographers, environmental scientists, public health researchers, and others to address urgent and sophisticated geographic problems, e.g., global change and epidemics such as SARS. On the other hand, existing data analysis methods fall short of extracting meaningful patterns from datasets of such unprecedentedly large size (in terms of the number of observations) and high dimensionality (in terms of the number of variables). Data mining and knowledge discovery in databases (KDD) refers to the overall process of discovering useful knowledge from data, which generally involves data selection, data pre-processing, data transformation, incorporation of appropriate prior knowledge, data mining, and proper interpretation of the results (Fayyad, Piatetsky-Shapiro et al. 1996). While data mining and KDD research has been widely conducted in areas such as business, bioinformatics, and text mining, it is still at a very early stage in geographic domains. Geography is an integrative discipline, and the geographic data under analysis often span multiple domains. The complexity of spatial data and geographic problems, together with intrinsic spatial relationships, constitutes an enormous challenge to conventional data mining methods and calls for both theoretical research and the development of new techniques to assist in deriving information from large and heterogeneous spatial datasets (Han and Kamber 2001; Miller and Han 2001; Gahegan and Brodaric 2002).
Related resources
Coordinating computational and visual approaches for interactive feature selection and multivariate clustering
Unknown (and unexpected) multivariate patterns lurking in high-dimensional datasets are often very hard to find. This paper describes a human-centered exploration environment, which incorporates a coordinated suite of computational and visualization methods to explore high-dimensional data for uncovering patterns in multivariate spaces. Specificall...
Methods for regression analysis in high-dimensional data
With advances in science, knowledge, and technology, new and more precise methods for measuring, collecting, and recording information have been developed, which has led to the appearance and development of high-dimensional data. A high-dimensional data set, i.e., a data set in which the number of explanatory variables is much larger than the number of observations, cannot be easily analyzed by ...
A Model for Tax Evasion Forcasting based on ID3 Algorithm and Bayesian Network
Nowadays, knowledge is a valuable and strategic source as well as an asset for evaluation and forecasting. Presenting these strategies in discovering corporate tax evasion has become an important topic today and various solutions have been proposed. In the past, various approaches to identify tax evasion and the like have been presented, but these methods have not been very accurate and the ove...
ICEAGE: Interactive Clustering and Exploration of Large and High-Dimensional Geodata
The unprecedented large size and high dimensionality of existing geographic datasets make the complex patterns that potentially lurk in the data hard to find. Clustering is one of the most important techniques for geographic knowledge discovery. However, existing clustering methods have two severe drawbacks for this purpose. First, spatial clustering methods focus on the specific characteristics ...
An interactive visual testbed system for dimension reduction and clustering of large-scale high-dimensional data
Many of the modern data sets such as text and image data can be represented in high-dimensional vector spaces and have benefited from computational methods that utilize advanced computational methods. Visual analytics approaches have contributed greatly to data understanding and analysis due to their capability of leveraging humans’ ability for quick visual perception. However, visual analytics...
Publication date: 2003